Minimally Supervised Morphological Analysis by Multimodal Alignment

نویسندگان

  • David Yarowsky
  • Richard Wicentowski
چکیده

This paper presents a corpus-based algorithm capable of inducing inflectional morphological analyses of both regular and highly irregular forms (such as brought→bring) from distributional patterns in large monolingual text with no direct supervision. The algorithm combines four original alignment models based on relative corpus frequency, contextual similarity, weighted string similarity and incrementally retrained inflectional transduction probabilities. Starting with no paired examples for training and no prior seeding of legal morphological transformations, accuracy of the induced analyses of 3888 past-tense test cases in English exceeds 99.2% for the set, with currently over 80% accuracy on the most highly irregular forms and 99.7% accuracy on forms exhibiting non-concatenative suffixation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ParaMor: Minimally Supervised Induction of Paradigm Structure and Morphological Analysis

Paradigms provide an inherent organizational structure to natural language morphology. ParaMor, our minimally supervised morphology induction algorithm, retrusses the word forms of raw text corpora back onto their paradigmatic skeletons; performing on par with state-ofthe-art minimally supervised morphology induction algorithms at morphological analysis of English and German. ParaMor consists o...

متن کامل

A Comparative Study on Minimally-Supervised Morphological Segmentation

This article presents a comparative study on a sub-field of morphology learning referred to as minimally-supervised morphological segmentation. In morphological segmentation, word forms are segmented into morphs, the surface forms of morphemes. In the minimally-supervised datadriven learning setting, segmentation models are learned from a small amount of manually annotated word forms and a larg...

متن کامل

Comparing minimally supervised home-based and closely supervised gym-based exercise programs in weight reduction and insulin resistance after bariatric surgery: A randomized clinical trial

    Background: Effectiveness of various exercise protocols in weight reduction after bariatric surgery has not been sufficiently explored in the literature. Thus, in the present study, we aimed at comparing the effect of minimally supervised home-based and closely supervised gym-based exercise programs on weight reduction and insulin resistance after bariatric surgery.  &n...

متن کامل

A Comparative Study of Minimally Supervised Morphological Segmentation

This article presents a comparative study of a subfield of morphology learning referred to as minimally supervised morphological segmentation. In morphological segmentation, word forms are segmented into morphs, the surface forms of morphemes. In the minimally supervised data-driven learning setting, segmentation models are learned from a small number of manually annotated word forms and a larg...

متن کامل

Minimally supervised lemmatization scheme induction through bilingual parallel corpora

We present a lemma induction scheme on a target language through minimally supervised alignment and transfer methods utilizing English-to-German parallel corpora. Compared to previous alignment and transfer approaches, the approach outlined here increases computational efficiency and significantly reduces the level of supervision necessary in inducing clusters of inflectional forms. Furthermore...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000